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Abstract 



Let Xq be an unknown M hy N matrix. In matrix recovery, one 
t-H takes n < MN linear measurements yi,...,yn of Xq, where yi = 

^ Tr(af Xq) and each is a M by matrix. For measurement matrices 

I— I with Gaussian i.i.d entries, it known that if Xq is of low rank, it is 

^ recoverable from just a few measurements. A popular approach for 

^ matrix recovery is Nuclear Norm Minimization (NNM): solving the 

convex optimization problem min ||A||* subject to j/j = Tr(a^A) for 
all 1 < i < n, where || • ||* denotes the nuclear norm, namely, the sum 
C\| of singular values. Empirical work reveals a phase transition curve, 

^ stated in terms of the undersampling fraction S{n, M, N) = n/ (MN), 

rank fraction p = r/N and aspect ratio /3 = M/N. Specifically, a 
curve 5* = 5*{p;f3) exists such that, if 5 > 5*{p;P), NNM typically 
succeeds, while if S < d*{p; it typically fails. 

An apparently quite different problem is matrix denoising in Gaus- 
sian noise, where an unknown M hy N matrix Xq is to be estimated 
5^ based on direct noisy measurements Y = Xq + Z, where the matrix 

Z has iid Gaussian entries. It has been empirically observed that, if 
Xq has low rank, it may be recovered quite accurately from the noisy 
measurement Y. A popular matrix denoising scheme solves the un- 
constrained optimization problem min \\Y — X||^/2 + A||X||*. When 
optimally tuned, this scheme achieves the asymptotic minimax MSE 
M{p) = limAT^oo infA suprank(x)<p-Af MSE{X, Xx)- 
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We report extensive experiments showing that the phase transition 
5*{p) in the first problem (Matrix Recovery from Gaussian Measure- 
ments) coincides with the minimax risk curve M{p) in the second 
problem (Matrix Denoising in Gaussian Noise): d*{p) = Ai{p), for 
any rank fraction < p < 1. 

Our experiments considered matrices belonging to two constraint 
classes: real M by matrices, of various ranks and aspect ratios, 
and real symmetric positive semidefinite N hy N matrices, of vari- 
ous ranks. Different predictions M{p) of the phase transition location 
were used in the two different cases, and were validated by the exper- 
imental data. 
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1 Introduction 



Let Xq be an unknown M by X matrix. How many measurements must we 
obtain in order to 'completely know' XqI While it seems that MN measure- 
ments must be necessary, in recent years intense research in applied mathe- 
matics, optimization and information theory, has shown that, when Xq is of 
low rank, we may efficiently recover it from a relatively small number of linear 
measurements by convex optimization [1-3]. Applications have been devel- 
oped in fields ranging widely, for example from video and image processing 
[4], to quantum state tomography [5], to collaborative filtering [1, 6]. 

Specifically, let A : M}'^^^ be a linear operator and consider mea- 

surements y = ^(Xo). If n < MN, the problem of inferring Xq from y 
may be viewed as attempting to solve an underdetermined system of equa- 
tions. Under certain circumstances, it has been observed that this (seemingly 
hopeless) task can be accomplished by solving the so-called nuclear norm 
minimization problem 

(Pnuc) niin ||X||* subject to y = ^(X) . (1) 

Here the nuclear norm ||X||^, is the sum of singular values of X. For example, 
it was found that if Xq is sufficiently low rank, with a principal subspace in 
a certain sense incoherent to the measurement operator A, then the solution 
Xi = Xi{y) to (Pnuc) is precisely Xq. Such incoherence can be obtained 
by letting A be random, for instance if ^(Xo)j = Tr(afXo) with G M™^" 
having i.i.d. Gaussian entries. In this case we speak of "matrix recovery from 
Gaussian measurements" [3]. 

A key phrase from the previous paragraph: 'if Xq is sufficiently low rank'. 
Clearly there must be a quantitative trade-off between the rank of Xq and 
the number of measurements required, such that higher rank matrices re- 
quire more measurements. In the Gaussian measurements model, with N 
sufficiently large, empirical work by Recht, Xu and Hassibi [7, 8], Fazel, 
Parillo and Recht [3], Tanner and Wei [9] and Oymak and Hassibi [10], doc- 
uments a phase transition phenomenon. For matrices of a given rank, there 
is a fairly precise number of required samples, in the sense that a transition 
from non recovery to complete recovery takes place sharply as the number of 
samples varies across this value. For example, in Figure 1 below we report 
results obtained in our own experiments, showing that, for reconstructing 
matrices of size 60 by 60 which are of rank 20, 2600 Gaussian measurements 
are sufficient with very high probability, but 2400 Gaussian measurements 
are insufficient with very high probability. 

In this paper, we present a simple and explicit formula for the phase tran- 
sition curve in matrix recovery from Gaussian measurements. The formula 
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Figure 1: Data from typical Phase Transition experiment. Here N = 60, r = 
20, and the number n of Gaussian measurements varies. Note: our formula 
predicts an asymptotic phase transition at 6* = 0.6937, corresponding to 
n = 2497. And, indeed, the success probability is close to 1/2 at that n. All 
runs involved T = 20 Monte Carlo trials. 



arises in an apparently unrelated problem: matrix de-noising in Gaussian 
noise. In this problem, we again let Xq denote an M by matrix, and 
we observe Y = Xq + Z , where Z is Gaussian iid noise Zij ~ A/'(0, 1). We 
consider the following nuclear norm de-noising scheme: 

(Pnucx) min l^lir - X\\l + A||X||,} . (2) 

In this problem the measurements Y are direct, so in some sense complete, 
but noisy. The solution Xx{Y) can be viewed as a shrinkage estimator. In 
the basis of the singular vectors Uy and Vy of Y, the solution Xx(Y) is 
diagonal, and the diagonal entries are produced by soft thresholding of the 
singular values of Y. 

Because the measurements y in the matrix recovery problem are noiseless 
but incomplete, while the measurements Y in the matrix denoising problem 
are complete but noisy, the problems seem quite different. Nevertheless, we 
show here that there is a deep connection between the two problems. 

Let us quantify performance in the denoising problem by the minimax 
MSE, namely 

M(p; M, N) = min max MSE(Xq , XxiY)), 

A rank{X)<pN 

where MSE refers to the dimension-normalized mean-squared error 

and subscript F denotes Frobenius norm. The asymptotic minimax MSE 
M.{p; (3) = limAr_>oo A^(p; /3A^, A^) has been derived in [11]. Exphcit formulas 
for the curve p ^ Ai{p; (3) appear in the Appendix. A parametric form is 
given for the case of asymptotically square matrices, (3 = 1. Figures 2 and 3 
depict the various minimax MSE curves. 
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Minimax MSE for Nuclear Norm Denoising 




Asymmetric 

Positive Definite 



p 

Figure 2: The two asymptotic minimax MSB's M(p|X): X = Mjy (black), 
X = SymrriN (red), in the case of square matrices. For non square matrices, 
see curves in Figure 3 




Figure 3: Curves: Asymptotic Minimax MSB's for nonsquare matrix set- 
tings MatM,N] varying shape factor /3 = M/N. Points: Bmpirical phase 
transition locations for matrix recovery from incomplete measurements, see 
Table 8. 
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We can now state our main hypothesis for matrix recovery from Gaussian 
measurements. 

Main Hypothesis: Asymptotic Phase Transition Formula. Con- 
sider a sequence of matrix recovery problems with parameters {{r,n,M, A^)}Ar>i 
having limiting fractional rank p = limN^oor/N , limiting aspect ratio (5 = 
limN^oo M/N , and limiting incompleteness fraction 6 = limN-^oo n/lMN). 
In the limit of large problem size N, the solution Xi{y) to the nuclear norm 
minimization problem (Pnuc) is correct with probability converging to one if 
S > Ai{p; (3) and incorrect with probability converging to one if S < Ai{p', P)- 
In short: The asymptotic phase transition 6*{p,P) in Gaussian matrix recov- 
ery is equal to the asymptotic minimax MSE A4 (p; /3) . 

In particular, for the case of small rank r, by studying the small p 
asymptotics of Eq. (14), we obtain that reconstruction is possible from 
n > 2r(M + + \/MN){l + 0{r/N)) measurements, but not from sub- 
stantially less. 

This brief announcement tests this hypothesis by conducting a substantial 
computer experiment generating large numbers of random problem instances. 
We use statistical methods to check for disagreement between the hypothesis 
and the predicted phase transition. To bolster the solidity of our results, 
we conduct the experiment in two different settings: (1) the matrix Xq is 
a general M by iV matrix, for various rank fractions p and aspect ratios /3; 
(2) Xq is a symmetric positive definite matrix, for various rank fractions p. 
In the latter case the positive semidefinite constraint is added to the convex 
program (Pnuc)- As described below, there are different asymptotic MSE 
curves for the two settings. We demonstrate an empirically accurate match 
in each of the cases, showing the depth and significance of the connection we 
expose here. 

In the discussion and conclusion we connect our result with related work 
in the field of sparsity-promoting reconstructions, where the same formal 
identity between a minimax MSE and a phase transition boundary has been 
observed, and in some cases even proved. We also discuss recent rigorous 
evidence towards establishing the above matrix recovery phase transition 
formula. 

2 Methods 

We investigated the hypothesis that the asymptotic phase transition bound- 
ary agrees with the proposed phase transition formula to within experimental 
error. 

For notational simplicity we will focus here on the case M = N, and defer 
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the case of non-square matrices to the SI. Hence, we will drop throughout the 
main text the argument /3 = 1. The asymptotic phase plane at point {p,S) 
is associated to triples {r,n,N), where p = r/N G [0, 1] is the rank fraction, 
and 6 = 6{n,N\'X.) = n/dim(X.) is the under sampling ratio, where dim(X.) 
is the dimension of the underlying collection of matrices X. We performed a 
sequence of experiments, one for each tuple, in which we generated random 
rank-r N hy N matrices Xq e X, random measurement matrices A — A of 
size n x A^^, and obtained random problem instances {y,A). We then ap- 
plied a convex optimization procedure, obtaining a putative reconstruction 
X = X{y,A). We declared a reconstruction successful when the Frobenius 
norm was smaller than a threshold. Our raw empirical observations consist of 
a count of empirical successes and sample sizes at a selected set of positions 
(p, 5) and a selected set of problem sizes N. From these raw counts we pro- 
duce fitted success probabilities 7r{r\n, N,%), The finite-A^ phase transition 
is the place where the true underlying probability of successful reconstruction 
take the value 50%. We tested the hypothesis that the finite- transition 
was consistent with the proposed asymptotic phase transition formula. 
This section discusses details of data generation and data analysis. 

2.1 Generation of Problem Instances 

Each problem instance {y, A) was generated by, first, generating a random 
rank r matrix Xq, then, generating a random measurement matrix A — An^i^2 
and then applying y = A ■ vcc^Xq). 

We considered problem instances of two specific types, corresponding to 
matrices G X, with X one of two classes of matrices 

• Matjq: &]\ N X N matrices with real- valued entries 

• Sym^: a\\ N x N real symmetric matrices which are nonnegative- 
semidefinite 

In the case X = MoIn, we consider low-rank matrices Xq — UV where 
U and V are each N hy r partial orthogonal matrices in the Stiefel manifold 
St{N,r). The matrices are uniformly distributed on St{N,r). In the case 
X = Syrrij^, we consider low-rank matrices Xq = UU' where t/ is an iV by 
r partial orthogonal matrix in St{N,r), and again the matrix is uniformly 
distributed. 

For measurement matrices A, we use Gaussian random matrices satisfying 
Aj~iV(0,l/n). 
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2.2 Convex Optimization 



For a given problem instance {y,A), we attempt to recover the underlying 
low-rank object Xq from the measurements y by convex optimization. Each 
of our choices X gives rise to an associated optimization problem: 

(Pnuc) min subject to ?/ = A ■ t;ec(X), X G X. 

Here X is one of these two classes of matrices Mat^ or Syrrij^. The two re- 
sulting optimization problems can each be reduced to a so-called semidefinite 
programming problem; see [12, 13]. 

2.3 Probability of Exact Recovery 

Since both the measurement matrix A, and the underlying low-rank object 
Xq are random, {y,A) is a random instance for (-P^c)- The probability of 
exact recovery is defined by 

7r(r|n,X, X) = ProbjXo is the unique solution of (P^c)}- 

Clearly < vr < 1; for fixed N, it is monotone decreasing in r and monotone 
increasing in n. Also 7c{r\n, N, MoIn) < 7r(r|n, N, Sym^). 

2.4 Estimating the Probability of Exact Recovery 

Our procedure follows [14, 15]. For a given matrix type X and rank r we 
conduct an experiment whose purpose is to estimate 7r(r|n, X, X) using T 
Monte Carlo trials. In each trial we generate a random instance (y. A) which 
we supply to a solver for obtaining the result Xi. We compare the 

result Xi to Xq. If the relative error ||Xo — Xi||i;'/||Xo||ir is smaller than a 
numerical tolerance, we declare the recovery a success; if not, we declare it a 
failure. (In this paper, we used an error tolerance of 0.001.) We thus obtain T 
binary measurements Yi indicating success or failure in reconstruction. The 
empirical success fraction is then calculated as 



7r(r|n,X, T, X) 



^{successes} 



^{trials} 




i=l 



These are the raw observations generated by our experiments. 
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2.5 Asymptotic Phase Transition Hypothesis 

Consider a sequence of tuples {r,n,N) with r/N — > p and n/N — )■ 6. We 
assume that there is an asymptotic phase transition curve (5*(p|X), i.e. a 
curve obeying 

vrfrin iV X) ^ I ^ ^ < 5*(p|X) .3. 
7r(r|n,7V,Xj ^ I 5 > 5*(p|X) 

For many convex optimization problems the existence of such an asymptotic 
phase transition is rigorously proven; see the Discussion below. 

The hypothesis we investigate concerns the value of (5*(p|X); specifically, 
whether 

<5*(p|X)=X(p|X). (4) 

Here Ai{p\Mat) (respectively Ai{p\Sym) ) is the minimax MSE for SVT 
for general matrices (respectively, positive definite ones). Formulas for Ai 
were derived by the Authors in [11]; computational details are provided in 
the Appendix. 

2.6 Empirical Phase Transitions 

The empirical phase transition point is estimated by fitting a smooth function 
Tx{n/N) (in fact a logistic function) to the empirical data 7r(r|n, iV, X) using 
the glmO command in the R statistics language. In fact we fit the logistic 
model that logit(7r) = log(j^) = a + 6A, where A(5|p) = 5 — M{p) is the 
offset between 6 and the predicted phase transition. The coefficients a and b 
are called the intercept and slope, and will be tabulated below. The intercept 
gives the predicted logit exactly at A = 0, i.e. S = A^(p). The empirical 
phase transition is located at S{r, N, M, X.) = A4{p) — a/b. This is the value 
of 5 = 6{n, A^|X) solving 

Tr{5) = 1/2. 
Under the hypothesis (4) we have 

lim lim 5(r,iV,r,X) = A^(p|X). 

N^oo,r/N^p T-5-00 

Consequently, in data analysis we will compare the fitted values 6{r, N, T, X) 
with Mir/N\X.). 

2.7 Experimental Design 

To address our main hypothesis regarding the agreement of phase transition 
boundaries, we measure 71 at points 6 = n/N and p = r/N in the phase 
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plane {6, p) which we expect to be maximally informative about the location 
of the phase transition. In fact the informative locations in binomial response 
models correspond to points where the probability of response is in the middle 
range (1/10, 9/10) [16]. As a rough approximation to such an optimal design, 
we sample at equispaced 5 G \M{p\X) — 0.05, A^(p|X) + 0.05]. 

2.8 Computing 

We primarily used the MATLAB computing environment, and the popular 
CVX convex optimization package [17]. A modeling system for disciplined 
convex programming by Boyd, Grant and others, supporting two open source 
interior-point solvers: SeDuMi and SDPT3 [18, 19]. 

We also studied the robustness of our results across solvers. Zulfikar 
Ahmed translated our code into Python and used the general purpose solver 
package CVXOPT by Anderson and Vandeberghe [20]. 

3 Results 

Our experimental data have been deposited at [21] they are contained in a 
text file with more than 100,000 lines, each line reporting one batch of Monte 
Carlo experiments at a given r, n, M, N and X. Each line also documents the 
number of Monte Carlo trials T, and the observed success fraction vr. The file 
also contains metadata identifying the solver and the researcher responsible 
for the run. 

In all cases, we observed a transition from no observed successes aX 5 = 
M.{p) — 0.05 to no observed failures aX 5 = M.{p) + 0.05. Figure 3 shows 
results we obtained at non square matrix ensembles, with varying (5 = M/N. 
The minimax MSE curves M(p|/3) vary widely, but the observed PT's track 
the curves closely. 

Figure 4 shows a small subset of our results in the square case X = 
Mat AT, to explain our empirical results; the full tables are given in SI for 
the square, non square and symmetric positive definite cases. In the square 
case, the empirical phase transition agrees in all cases with the formula A^(p) 
to two digits accuracy. Table 7 shows that, in the symmetric nonnegative- 
definite case X = Syrrij^, the empirical phase transition falls within [Ai{p) — 
0.01, M(p) + 0.01]. 

Previous empirical studies of phase transition behavior in sparse recovery 
show that, even in cases where the asymptotic phase transition curve is 
rigorously proven and analytically available, such large-A^ theory cannot be 
expected to match empirical finite- data to within the usual naive standard 
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Figure 4: Results with X = Mat^, with p = 1/10 and Varying A^. T 
total number of reconstructions; a, b: fitted logistic regression parameters; 
Z: traditional Z-score of logistic regression intercept. See Table 6 for the 
complete table. 



errors [14, 15]. Instead, one observes a finite transition zone of width ^ 
Ci/N^^"^ and a small displacement of the empirical phase transition away from 
the asymptotic Gaussian phase transition, of size ~ C2/N . Hence, the strict 
literal device of testing the hypothesis that E5 = M{p) is not appropriate 
in this setting ^ 

A precise statement of our hypothesis uses the fitted logistic parameters 
a = d{r, N, M,X.) and b = b{r, N,T,X.), defined above. The asymptotic 
relation^ 

lim hm N,T,X.) ^ ^ 

N-^^T-^^ b{r,N,T,X) 

implies (5(p|X) = A^(p|X) . Now note that in Figure 4 the coefficient b 
scales directly with N and takes the value several hundred for large A^. This 
means that, in these experiments, the transition from complete failure to 
complete success happens in a zone of width < 1/100. Notice also that a 
stays bounded, for the most part between and 2. This means that the 
response probability evaluated exactly at M.{p) obeys typically: 

logit(7r(A^(p))e [0,2], 

Figure 5, Panel A presents the fitted slopes b and problem sizes A^,as well 
as an empirical fit. All the data came from Table 6, but we omitted results 
for p G {3/4,9/10} because those slopes were large multiples of all other 



^As shown in Figure 4, and in Tables 6, 7, 8, 9, and 10, our results in many cases do 
generate Z scores for the intercept term in the logistic regression which are consistent 
with traditional acceptance of this hypothesis. However, traditional acceptance is not 
needed, in order for the main hypothesis to be valid, and because of finite-A^ scaling 
effects indicated above, would not ordinarily be expected to hold. 

^ =p denotes convergence in probability. 
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30 40 50 60 70 80 90 100 40 50 60 70 80 90 
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Figure 5: Finite-A^ scaling effects. Fitted slopes b from Tables 6 and 7 
versus matrix sizes A^. Left Panel: asymmetric square matrices X = Mat at. 
Right Panel: symmetric nonnegative-definite matrices X = Symjy. Super- 
imposed lines have formulas b = 10 + 3N. Note: in Left Panel, data for 
p e {3/4, 9/10} were excluded because the slopes b were very large (700 and 
2700, respectively). 

slopes. Similarly Figure 5, Panel B presents the slopes from Table 7. In each 
plot, the line 6 = 6 + 3N is overlaid for comparison. Both plots show a clear 
tendency of the slopes to increase with N. 

The combination of linear growth of b with and non-growth of a 
implies Eq. (5), and acceptance of our main hypothesis. 

Nongaussian Measurements. This paper studies matrix recovery from 
Gaussian measurements of the unknown matrix, and specifically does not 
study recovery from partial entry wise measurements typically called 'matrix 
completion'. Entry wise measurements is yield a phase transition at a different 
location [9] . Our conclusions do extend to certain nonGaussian measurements 
based on A with independent and identically distributed entries that are 
equiprobable ±1 (a.k.a. Rademacher matrices). See Table 10. 

•^Actually, even sub linear growth of a implies the result. 
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4 Discussion: Existing Literature and Our 
Contributions 



Phase transitions in the success of convex optimization at solving non-convex 
problems have been observed previously. Donoho and Tanner considered 
linear programs useful in the recovery of sparse vectors, rigorously established 
the existence of asymptotic phase transitions, and derived exact formulas for 
the asymptotic phase transition [14, 22-25] as well as finite exponential 
bounds. Their work considered the recovery of fc-sparse vectors with 
entries from n Gaussian measurements. The phase diagram in those papers 
could be stated in terms of variables (k, 5), where, as in this paper, 6 = n/N 
is the under sampling fraction, and k = k/N is the sparsity fraction, which 
plays a role analogous to the role played here by the rank fraction p. The 
proofs in those papers were obtained by techniques from high-dimensional 
combinatorial geometry, and the formulas for the asymptotic phase transition 
were implicit, written in terms of special functions. 

Donoho, Maleki, and Montanari [26] later developed a new so-called Ap- 
proximate Message Passing (AMP) approach to the sparse recovery problem, 
which gave new formulas for phase boundaries, confirmed rigorously by Bay- 
ati and Montanari in [27, 28]. While the previous formulas involved com- 
binatorial geometry, the new (necessarily equivalent) ones involved instead 
minimax decision theory. An extensive literature on AMP algorithms has 
since developed see, e.g. [29, 30], implying, among other things, universality 
in the sparse recovery phase transition [31]. 

Donoho, Johnstone and Montanari [32] generalized previous AMP-based 
phase transition results to block-sparse, monotone, and piecewise constant 
signals. They provided evidence that the observed phase transition of 
the associated convex optimization reconstruction methods obeyed formulas 
of the form 

5*{k) = M{k), 0<k<1. (6) 

Here k is a variable measuring generalized sparsity and A^(k) the minimax 
MSE of an appropriate denoiser based on direct noisy measurements. The 
main result in this brief report fits in this general framework whereby the 
sparsity k is identified with the fractional rank p, and the minimax MSE 
symbol M applies to the singular value thresholding denoiser. Our main re- 
sult is therefore an extension of DJM-style formulas from the sparse recovery 
setting problem to the low rank recovery setting. 

Much mathematical study of matrix recovery [1, 3, 8, 33] has focused on 
providing rigorous bounds which show the existence of a region of success, 
without however establishing a phase transition phenomenon, or determining 
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its exact boundary. A relatively accurate result by Candes and Recht (CR) 
for the case of Gaussian measurements [34], implies that n > 3r{M + N)[l + 
0{r/N)] measurements are sufficient. Our formulas for the square case M = 
N show that 

M{p\MatN) ^ Qp, p->0. 

This agrees with the CR bound in the very low-rank case. However, our 
relation S*{p) = A^(p) is apparently noticeably more accurate than the CR 
formula at finite A^. Table 9 presents experiments where the rank is fixed 
at r = 1,2,3, or 4, and varies between 40 and 90. Even though in such 
cases the corresponding p = r/N is rather small, for example 1/90 in the 
case r = 1 and A^ = 90, the empirical PT in our experiments agrees much 
more closely with the nonlinear formula A4{p) than it does with the linear 
formula 6p. Also, in the non square case /3 7^ 1, A/1 ~ 2p(l + /3 + \/~^), which 
is strictly smaller than 6p for /3 < 1, so the CR formula is noticeably less 
accurate than the 6* = A4 formula in the non square case. 

Of the mathematical methods developed to identify phase transitions 
but which are not based on combinatorial geometry or approximate message 
passing, the most precise are based on the 'Escape Through the Mesh' (ETM) 
technique of Yoram Gordon [10, 35]. ETM was used to prove upper bounds 
on the number of Gaussian measurements needed for reconstruction of sparse 
signals by Stojnic [35] and for low-rank matices by Oymak and Hassibi [10]. 
In particular, [10] studies one of the two cases studied here and observes 
in passing that in the square case, ETM gives bounds that seemingly agree 
with actual phase transition measurements. Very recently, building on the 
same approach, a DJM-style inequality S*{k) < Ai{K) has been announced 
by Oymak and Hassibi for a wide range of convex optimization problems 
in [36], including nuclear norm minimization; [36] also presented empirical 
evidence for S*{p) = A4(p) in the square case X = Mat^. 

Our Contributions. This paper presents explicit formulas for the min- 
imax MSE of singular value thresholding in various cases, and shows that 
the new formulas match the appropriate empirical phase transitions in a 
formal comparison. Compared to earlier work, we make here the following 
contributions: 

• A Broad Range of Phase Transition Studies for non square, square, and 
symmetric nonnegative-definite matrix recovery from Gaussian mea- 
surements. We also made certain nonGaussian measurements and ob- 
served similar results. 

• A Broad Range of Prediction formulas. We make available explicit for- 
mulas for Minimax MSE in the square symmetric nonnegative-definite. 
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or asymmetric case, as well as non square cases. 

• Careful Empirical Technique, including the following: 

Reproducibility. Publication of the code and data underlying our 
conclusions. 

Validation. Matlab/CVX results were re-implemented in Python 
/ CVXOPT, with similar results. Code was executed on 3 different 
computing clusters, with similar results. 

Study of Finite-N -scaling. We studied tendencies of a,b as 
varies, and observed behavior consistent with our asymptotic phase 
transition hypothesis. Studies at a single N could only have shown 
that an empirical phase transition was near to a given theoretical 
curve, at a given A^, but not shown the scaling behavior with N 
that the main hypothesis properly involves. 

5 Conclusions 

For the problem of matrix recovery from Gaussian measurements, our exper- 
iments, as well as those of others, document the existence of a finite- A?^ phase 
transition. We compared our measured empirical phase transition curve with 
a formula from the theory of matrix denoising and observed a compelling 
match. Although the matrix recovery and matrix denoising problems are 
superficially different, this match evidences a deeper connection, such that 
mean squared error properties of a dcnoiscr in a noise removal problem give 
precisely the exact recovery properties of a matrix recovery rule in a noiseless, 
but incomplete data problem. 

This connection suggests both new limits on what is possible in the matrix 
recovery problem, but also new ways of trying to reach those limits. 
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A Asymptotic Minimax MSE Formula 

The following provides explicit formulas for the matrix denoising minimax 
curves J^{p, f3\Mat) and Ai{p\Sym) used above. Please see [11] for the 
derivations. Computer programs that efficiently calculate these quantities 
are provided in [21]. Let 



7+ 



27r7 



(7) 



where 7± = (l ± a/t)^, denote the complementary incomplete moments of 
the Marcenko-Pastur distribution [37]. Define 



M,(A; p,p) = p + p-pp+{l-p) [pA2+ 
+a(l - p) fP^(A2; 1) - 2AP^(A2; 1) + A2p^(A2; 0) 



with 7 = 7(p, p) = (p - pp)/(p - pp). 
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Case X = MatM,N- The minimax curve is given by Jli{p, P\Mat) = 
infA Ml (A; p, (5p). The following minimaxity interpretation is proved in [11] 

lim inf sup MSE{Xx,Xo) = M{p,P\Mat) . (9) 

rank{Xo)<pPN 

Case X = Sym^. The minimax curve is given by A^(/3|S'j/m) = infA Mi/2(A; p, 
The following minimaxity interpretation is proved in [1 1] 

lim inf sup MSE{X^,Xq) = M{p\Sym) . (10) 

rank(Xo)<pN 



Computing A^(p, /3|X). The map A h-> Ma(A;p, p) is convex. Solving 

(iMa/dA = we get that argmiuA M„(A; p, p) is the unique root of the 
equation 

A-ip,(A2; \) - P,(A2; 0) = . (11) 

The right hand side of (11) is decreasing in A and the solution is determined 
numerically by binary search. 

For square matrices (p = p) (8) can be expressed using elementary 
trigonometric functions. In [11] it is shown that 

M„(A; p, p) = p(2 - p) + (1 - p) [pA2+ (12) 
+a(l - p) (g2 (A) - 2Agi (A) + A2Qo (A)) ] . 

where 

2 

Qo(a;) = - / ^4 - x2 = 1 —y/i-x"^ n±an{ —^ — - ) 

■K J 2-K TT V4 - 

X 

2 

Ql{x) = - f xV^-x^ = — (4-a;2)3/2 
IT J 3n 

X 

2 

Q2{x) = — / x2v'4 - i:^ = 1 x\/A - x'^{x^ - 2) asin{-) 

TT 7 47r TT 2 

are the complementary incomplete moments of the Quarter Circle law. More- 
over 

argminAM„(A; p, p) = 2 ■ sin {0a{p)) , (13) 
where Oa{p) G [0,7r/2] is the unique solution to the transcendental equation 

» + c<,<(»).(l4c<,.^w)=^(l±^^. (14) 

which is a simplified version of (11). 
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Parametric representation of the minimax curves. For square ma- 
trices (p = p) the minimax curves A^(p, l|Mat) and A4{p\Sym) admit a 
parametric representation in the (p, A^) plane using elementary trigonomet- 
ric functions, see [11]. As 9 ranges over [0,7r/2], 



p{0) 
M{e) 



2p{e)-p^{e) 



7r/2 

{cot(e) ■ (1 - icos2(e))) 



+ -{i-P? 



[it - 26»)(- - cos{eY) + 



sin{20) 
12 



{cos{29) - 14) 



is a parametric representation of A^(p, l\Mat), and similarly 



M{e) 



e + {cot(e) ■ (1 - icos2(e))) - tt/2 



e + {cot(e) ■ (1 - |cos2(6»))) + 7r/2 

2p(e) - p\e) + 4p(e)(i - p(0))sin\e) 

2 r, 5 



(tt - 26»)(- - cos(eY) + 



sin{2Q) 
12 



(cos(26l) - 14) 



is a parametric representation of M.{^p\Sym). 
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Summary of Empirical Results 
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Figure 6: Results with X = MatN. 
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Figure 7: Results with X = Sym^. 



24 



R 
H 




A4( 






If 


z 


VM ■ N 




0.100 


0.241 


0.241 


0.020 


215.506 


0.086 


60.000 




0.125 


0.290 


0.290 


-0.037 


322.468 


-0.127 


64.000 




0.143 


0.323 


0.321 


0.468 


238.477 


1.813 


70.000 


1 /4 


0.167 


0.365 


0.362 


0.884 


247.690 


3.109 


60.000 




0.200 


0.421 


0.422 


-0.153 


246.541 


-0.597 


60.000 




0.250 


0.498 


0.496 


0.339 


214.134 


1.406 


64.000 




0.333 


0.610 


0.607 


0.856 


284.629 


2.834 


60.000 




0.500 


0.788 


0.786 


0.635 


297.803 


2.124 


64.000 




0.100 


0.255 


0.255 


-0.105 


269.362 


-0.395 


51.962 




0.143 


0.340 


0.340 


0.031 


210.274 


0. 133 


48.497 




0.167 


0.383 


0.381 


0.373 


169.227 


1.733 


51.962 


J-/0 


0.200 


0.440 


0.437 


0.589 


197.336 


2.467 


51.962 




0.250 


0.517 


0.517 


0.113 


218.479 


0.470 


65.426 




0.333 


0.629 


0.626 


0.661 


219.930 


2.584 


51.962 




0.500 


0.801 


0.799 


0.572 


274. 122 


2.036 


55.426 




200 


475 


474 


127 




616 


42 426 




0.250 


0.554 


0.553 


0.258 


201.036 


1.114 


46.256 




0.286 


0.604 


0.603 


0.307 


167.966 


1.434 


49.497 




0.333 


0.665 


0.662 


0.486 


155.704 


2.327 


42.426 




Q 4Q{) 


0.738 


0.737 


0.292 


184.091 


1 .308 


42.426 




0.500 


0.827 


0.825 


0.460 


228.287 


1 .826 


45.255 




0.667 


0.930 


0.927 


0.729 


248.019 


3.073 


42.426 




0.100 


0.296 


0.295 


0. 146 


1Q1 Q-"in 
± y J- . you 


0.646 


38.730 




0.125 


0.352 


0.351 


0.242 


219.767 


1.000 


61.968 




0.143 


0.389 


0.388 


0.105 


179.344 


0.484 


64.222 


3/5 


0.167 


0.436 


0.433 


0.425 


198.788 


1.814 


46.476 


0.200 


0.495 


0.494 


0.237 


146.409 


1.196 


38.730 




0.250 


0.575 


0.573 


0.361 


164.339 


1.704 


46.476 




0.333 


0.686 


0.684 


0.509 


258.045 


1.886 


46.476 




0.500 


0.843 


0.842 


0.153 


156.137 


0.761 


46.476 




0.100 


0.305 


0.306 


-0.168 


221.047 


-0.697 


36.742 




0.125 


0.363 


0.362 


0.064 


151.402 


0.321 


39.192 




0.143 


0.401 


0.399 


0.263 


142.052 


1.346 


34.293 


2/3 


0.167 


0.448 


0.446 


0.271 


157.783 


1.323 


36.742 


0.200 


0.509 


0.510 


-0.209 


156.201 


-1.034 


36.742 




0.250 


0.589 


0.587 


0.315 


152.493 


1.666 


39.192 




0.333 


0.700 


0.697 


0.408 


190.613 


1.768 


36.742 




0.500 


0.854 


0.853 


0.257 


232.865 


1.034 


39.192 




0.100 


0.317 


0.317 


-0.043 


163.597 


-0.207 


34.641 




0.125 


0.376 


0.375 


0.237 


192.933 


1.047 


55.426 




0.143 


0.415 


0.412 


0.562 


178.303 


2.476 


48.497 


3/4 


0.167 


0.463 


0.461 


0.304 


174.599 


1.406 


41.569 


0.200 


0.525 


0.524 


0.169 


120.469 


0.886 


34.641 




0.250 


0.606 


0.604 


0.276 


154.259 


1.353 


41.569 




0.333 


0.716 


0.714 


0.378 


180.963 


1.682 


41.569 




0.500 


0.867 


0.866 


0.341 


283.911 


1.227 


41.569 




0.100 


0.324 


0.323 


0.220 


174.269 


1.019 


44.721 




0.126 


0.384 


0.381 


0.423 


160.001 


1.999 


44.721 




0.143 


0.423 


0.422 


0.273 


248.331 


1.056 


62.610 


4/5 


0.167 


0.472 


0.472 


-0.025 


190.094 


-0.112 


40.249 


0.200 


0.534 


0.533 


0.204 


195.806 


0.889 


44.721 




0.250 


0.616 


0.616 


-0.076 


197.138 


-0.333 


44.721 




0.333 


0.726 


0.725 


0.213 


203.942 


0.910 


40.249 




0.500 


0.875 


0.873 


0.412 


215.351 


1.689 


44.721 



Figure 8: Results with non square matrices X = MatM,N- Each row based 
on T = 400 Monte Carlo trials. 
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Figure 9: Results with low rank square matrices X = MatN,N- Each row 
based on T = 400 Monte Carlo trials. 
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Figure 10: Results with Rademacher measurements of square matrices X = 
Matpf^N. Each row based on T = 400 Monte Carlo trials. 
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C Data Deposition 

The data have been deposited in a text file at [21]. A typical fragment of the 
file is given here: 



Line 


Project 




Experiment 


M 


N 


S 


Instance rank rho 




delta 




ErrO Errl Err2 


3381 


Nuc_ 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


m 4 





.333333333333333 


0, 


73396061728395 0.( 


D16630719182629 0.444444444444444 


3382 


Nuc_ 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


in 4 





.333333333333333 


0, 


739213775178687 





.0159010475527301 0.465277777777778 


3383 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


m 4 





.333333333333333 


0, 


744476933073424 





.0161172486497232 0.416666666666667 


3384 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


ra 4 





.333333333333333 


0, 


749740090968161 





.00149080158529591 1 1 


3385 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


m 4 





.333333333333333 


0, 


765003248862898 





.0340839945130298 0.173611111111111 


3386 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


ra 4 





.333333333333333 


0, 


760266406757635 





.0235186066093925 0.361111111111111 


3387 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


ra 4 





.333333333333333 


0, 


765529564652372 





.0136400464216757 0.506944444444444 


3388 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


ra 4 





.333333333333333 


0, 


770792722647109 


1 


. 11644848368808e-09 1 1 


3389 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


ra 4 





.333333333333333 


0, 


776065880441845 





.00194118165102407 1 


3390 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


ra 4 





.333333333333333 


0, 


781319038336582 





.0059943065062774 0.902777777777778 


3391 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01II1 


12 


12 


1 


ra 4 





.333333333333333 


0, 


786582196231319 


1 


.83878516274726e-09 1 1 


3392 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


ra 4 





.333333333333333 


0, 


791845354126066 


7 


.85226405646261e-09 1 1 


3393 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


ra 4 





.333333333333333 


0, 


797108512020793 


4 


.47138355029567e-10 1 1 


3394 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


ra 4 





.333333333333333 


0, 


80237166991553 


l.( 


D3566257175607e-08 1 1 


3395 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


ra 4 





.333333333333333 


0, 


807634827810266 


2 


.62954112892325e-09 1 1 


3396 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


ra 4 





.333333333333333 


0, 


812897985705003 





.00120410713874671 1 1 


3397 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


ra 4 





.333333333333333 


0, 


81816114369974 


1.1 


33259685844506e-08 1 1 


3398 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


ra 4 





.333333333333333 


0, 


823424301494477 


2 


.522026416190Sle-10 1 1 


3399 


Nuc. 


.CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01in 


12 


12 


1 


ra 4 





.333333333333333 


0, 


828687459389214 


1 


.94437637948599e-09 1 1 


3400 


Nuc. 


-CVX. 


.20121129dii 


Nuc. 


.CVX. 


.N12S01m 


12 


12 


1 


ra 4 





.333333333333333 


0, 


.83396061728395 


1.1 


39829115472996e-10 1 1 



The fields have the following meaning 

• Line - Line number in file; in the above example, lines 3381-3400. 

• Project - File identifier - allows identification of code and logs that 
generated these data; in the above fragment, 'Nuc_CVX_20121129dii ' . 

• Experiment - File identified - allows identification of code and logs 
that generated these data; in the above fragment 'Nuc_CVX_N12S01m' . 

• M,N - matrix size of Xq, i.e M hy N matrix; in the above fragment 
M = N = 12. 

• S - number of matrices in a stack (see below) 

• Instance - alphabetic code a-t, identifying one of 20 identical runs which 
generated this result; in the above fragment, 'm'. 

• rank - integer rank of matrix; in the above fragment 'm'. 

• rho - fraction in [0, 1], p = rank/A^; in the above fragment, 1/3. 

• delta - 6 = n/{MN) in case of asymmetric matrix, ot 6 = 2*n/{N{N+ 
1)) in case of symmetric matrix. 

• ErrO - ||X - XoWp/iNMy/^ 

• Errl - 1 iff ||X — Xo||F/||Xo||i? < tol, and otherwise 

• Err2 - fraction of entries with discrepancy \X{i,j) — Xo(z, j)| < tol. 
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Additional concepts: 

• Numerical Tolerance. In our experiments, we used a numerical error 
tolerance parameter tol = 0.001. 

• Our experiments also extensively covered cases where Xq is a 'stack of 
matrices', i.e. a 3-way array M x N x S, where S is the number of 
items in the stack. The only cases of interest for this paper are S = 1. 

D Code Deposition 

There are two types of code deposition. 

• Reproduction from Data Deposition. The code that actually makes the 
figures and tables we presented in this paper, starting from the data 
deposition. This is deposited at [21]. The code we actually ran to 
create our figures and tables is a set of R scripts, and was run on a 
Mac OS X. We believe the same code runs with minimal changes on a 
LINUX environment. 

• RunMyCode Deposition. For readers who wish simply to compute the 
value of the Minimax Mean-Squared Error over each of the matrix 
classes we considered, we offer a Minimax MSE calculator at Run- 
MyCode. org. 

• Full Code and Results Deposition. At [21], we also offer a literal dump 
of the code we ran and all the logs and result files we obtained. 

We believe the first two items are self-documenting. The third item can 
be explained as follows. Our database of the experiment and all results is 
contained in a unix directory tree rooted at exp. 

Inside directory exp one finds further directories, as indicated below: 



Nuc. 


.cvx. 


.20120605a 


Nuc. 


.cvx. 


.20120607a 


Nuc. 


.CVX. 


.20121107a 


Nuc. 


.CVX. 


.20121113a 


Nuc. 


.cvx. 


.20121120a 


Nuc. 


.cvx. 


.20121121a 


SDP. 


.cvx. 


.20121230h 
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SDP_CVX_20121230i 

Nuc_CVX_20130110a 
Nuc_CVX_20130110c 

Nuc_CVX_20130117Cc 
Nuc_CVX_20130117Cd 

Nuc_CVX_20130126Df 
Nuc_CVX_20130126Dg 
Nuc_CVX_20130126Dh 

These directory names are precisely the Project names used in the data de- 
position. Say we look inside one of these directories, for example 'Nuc_CVX_20121121b' 
We will find a directory called bin containing software, and a list of further 
directories. We excerpt from a 2-column listing of those directories: 



Nuc. 


.cvx. 


.N05Sla 


Nuc. 


.cvx. 


.N30S1C 


Nuc. 


.cvx. 


.N05Slb 


Nuc. 


.cvx. 


.N30Sld 


Nuc. 


.CVX. 


.N05S1C 


Nuc. 


.cvx. 


.N30Sle 


Nuc. 


.cvx. 


.NlOSlj 


Nuc. 


.cvx. 


.N35S11 


Nuc. 


.cvx. 


.NlOSlk 


Nuc. 


.cvx. 


.N35Slm 


Nuc. 


.cvx. 


.NlOSll 


Nuc. 


.cvx. 


.N35Sln 


Nuc. 


.cvx. 


.N15Slh 


Nuc. 


.cvx. 


.N40Slj 


Nuc. 


.cvx. 


.N15Sli 


Nuc. 


.cvx. 


.N40Slk 


Nuc. 


.cvx. 


.N15Slj 


Nuc. 


.cvx. 


.N40S11 


Nuc. 


.cvx. 


.N20S1O 


Nuc. 


.cvx. 


.N45Slq 


Nuc. 


.cvx. 


.N20Slp 


Nuc, 


-CVX_ 


_N45Slr 


Nuc. 


.cvx. 


.N20Slq 


Nuc. 


.cvx. 


.N45S1S 


Nuc. 


.cvx. 


.N25Sln 


Nuc. 


.cvx. 


.N50Slp 


Nuc. 


.cvx. 


.N25S10 


Nuc. 


.cvx. 


.N50Slq 


Nuc. 


.cvx. 


.N25Slp 


Nuc. 


.cvx. 


.N50Slr 



These directory names are precisely the 'Experiment' values seen in the 
data deposition. Inside the bin directory we find the matlab code used in 
common by all the above experiments. 
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Nuc_CVX_20121121b$ Is bin 
aveRank . m solveNuc_CVX_Stack_Arb . m 
predNucPT.m stackRankRMatrix.m 
rankRMatrix . m 

Inside one of the experiment directories wc find experiment-specific files (in 
both Matlab and Bash script) as well as output files. The subdirectory logs 
contains logs that were created while the jobs were running. 

Nuc_CVX_20121121b $ Is -R -CI Nuc_CVX_N25Slo 

bashMain. sh 

PTExperiment .m 

matlabMain.m 

results .mat 

randState .mat 

Nuc_CVX_N25Slo/logs : 

Nuc_CVX_N25Slo . stderr 

Nuc_CVX_N25Slo . stdout 

runMatlab . 20121121b_Nuc_CVX_N25Slo . log 

Here the .mat files were produced by matlab during the running of the ex- 
periment. 

• randState .mat preserves the state of the random number generator at 
the beginning of that experiment. 

• results .mat gives the results of the individual problem instances; typ- 
ically in the form (r, n, N, M, ErrO, Errl, Err2). 
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